





# End-to-end network programmability

From core switches to end hosts

Gregorio Procissi (gregorio.procissi@unipi.it), Giuseppe Lettieri (giuseppe.lettieri@unipi.it)





#### Agenda

#### Network programmability all over the network

- Part I (Gregorio Procissi)
  - From original SDN to programmable network data-plane
  - In-network computing: programmable switches and the P4 language
  - P4 in practice: running a programmable software switch in an emulated network environment

- Part II (Giuseppe Lettieri)
  - End-host computing
    - in-kernel networking with extended Berkeley Packet Filter (eBPF)
  - eBPF in practice
    - programming simple applications on a Linux machine







# Part I

Gregorio Procissi

In-network data plane programmability







#### Data Plane/Control Plane...

- Data Plane (Forwarding plane)
  - Processing of data packets, in particular forwarding

- Control plane
  - Intelligence of the network, defines how to handle packets
  - example: routing







#### ... In traditional networks

- Data and control planes vertically integrated
  - in every network node
  - control plane distributed across nodes (routers)
- Destination-based forwarding









#### ... and in Software Defined Networks

Data plane and control plane are logically separated









## **Software Defined Networking (SDN)**

- Flow-based forwarding
  - flow: set of header field values used as matching rules
  - **actions**: to be applied to packets belonging to the same flow

- Extended concept of forwarding node:
  - a switch may be **instructed** to be a router, firewall, load-balancer, and so on

- At this stage network applications run on top of the SDN controller
  - just as processes run on top of operating systems
  - programmability is quite classic
  - does it **scale** at high traffic rates...?







#### OpenFlow

- The first successful incarnation of the SDN paradigm, from Stanford University
- from OF paper (2008): "amenable to high performance and low-cost implementation; capable of supporting a broad range of research; and consistent with vendors' need for closed platforms"
- Essentially, OpenFlow is an API to a switch flow table









# **OpenFlow tables (examples)**

#### L2 switch

| Switch<br>Port | Mac src | Mac<br>dst        | Eth<br>type | IP<br>src | IP<br>dst | TCP<br>s_port | TCP<br>d_port | Action                |
|----------------|---------|-------------------|-------------|-----------|-----------|---------------|---------------|-----------------------|
| *              | *       | 68:5b:35:83:c8:56 | *           | *         | *         | *             | *             | Forward to<br>Port P2 |

#### L3 forwarding

| Switch<br>Port | Mac src | Mac<br>dst | Eth type | IP<br>src | IP<br>dst    | TCP<br>s_port | TCP<br>d_port | Action                |
|----------------|---------|------------|----------|-----------|--------------|---------------|---------------|-----------------------|
| *              | *       | *          | 0x0800   | *         | 131.114.52.* | *             | *             | Forward to<br>Port P3 |

#### **L4 Firewall**

| Switch<br>Port | Mac src | Mac<br>dst | Eth type | IP<br>src    | IP<br>dst | TCP<br>s_port | TCP<br>d_port | Action |
|----------------|---------|------------|----------|--------------|-----------|---------------|---------------|--------|
| *              | *       | *          | 0x0800   | 131.114.52.* | *         | *             | 6969          | Drop   |





#### **Match-Action Tables**

- Match-Action Table
  - Match:
    - binary match (0,1)
    - ternary match (0,1,\*)
  - Action:
    - drop, forward, modify, go to another table

| Switch<br>Port | Mac src | Mac<br>dst        | Eth<br>type | IP<br>src | IP<br>dst | TCP<br>s_port | TCP<br>d_port | Action                |
|----------------|---------|-------------------|-------------|-----------|-----------|---------------|---------------|-----------------------|
| *              | *       | 68:5b:35:83:c8:56 | *           | *         | *         | *             | *             | Forward to<br>Port P2 |
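The lookup logic behind such a table can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the names `Rule` and `lookup` are made up for the illustration, wildcards apply to whole fields rather than to individual bits, and rules are tried in order (a real switch would use explicit priorities).

```python
from dataclasses import dataclass

WILDCARD = "*"  # matches any value, like the * entries in the tables above


@dataclass
class Rule:
    match: dict   # field name -> required value (or WILDCARD)
    action: str   # e.g. "forward:P2" or "drop"


def lookup(table, packet, default="drop"):
    """Return the action of the first rule whose fields all match the packet."""
    for rule in table:
        if all(v == WILDCARD or packet.get(f) == v for f, v in rule.match.items()):
            return rule.action
    return default  # assumption: unmatched packets are dropped


# Two rules in the spirit of the L4 firewall and L2 switch examples.
table = [
    Rule({"tcp_dport": 6969}, "drop"),
    Rule({"mac_dst": "68:5b:35:83:c8:56", "ip_src": WILDCARD}, "forward:P2"),
]

pkt = {"mac_dst": "68:5b:35:83:c8:56", "ip_src": "10.0.0.1", "tcp_dport": 80}
print(lookup(table, pkt))
```

Fields that a rule does not mention behave as wildcards too, which keeps the rule definitions short.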







#### **OpenFlow Switching platforms**

- Commodity CPU: software switch (OVS, OFSoftSwitch, etc.)
- FPGA, NetFPGA
- Network Processor (NPU)
- Switching chips
  - ~ 100x faster than CPUs
  - ~ 10x faster than NPUs









#### Is it really data plane programming?

- To add expressiveness the protocol has become more and more complex
  - 12 match fields in OF 1.0 ---> 41 in OF 1.4
- **Fixed** type tables
- No custom actions upon matching
  - only a set of "pre-cooked" actions
- No room for defining custom state variables (across different packets)

enum ofp\_action\_type { OFPAT\_OUTPUT, OFPAT\_COPY\_TTL\_OUT, OFPAT\_COPY\_TTL\_IN, OFPAT\_SET\_MPLS\_TTL, OFPAT\_DEC\_MPLS\_TTL, OFPAT\_PUSH\_VLAN, OFPAT\_POP\_VLAN, OFPAT\_PUSH\_MPLS, OFPAT\_POP\_MPLS, OFPAT\_SET\_QUEUE, OFPAT\_GROUP, OFPAT\_SET\_NW\_TTL, OFPAT\_DEC\_NW\_TTL, OFPAT\_SET\_FIELD, OFPAT\_PUSH\_PBB, OFPAT\_POP\_PBB, OFPAT\_EXPERIMENTER };







#### Deeper network programmability

- Why not define flexible parsers to match arbitrary header fields?
- Why not define custom tables for matching rules?
- Why not implement custom actions to be applied upon matching rules?
- Why not support custom stateful operations?
- The answer again came from Stanford University: Programming Protocol-independent Packet Processors (P4) + PISA

#### P4: Programming Protocol-Independent Packet Processors

Pat Bosshart†, Dan Daly\*, Glen Gibb†, Martin Izzard†, Nick McKeown‡, Jennifer Rexford\*\*, Cole Schlesinger\*\*, Dan Talayco†, Amin Vahdat¶, George Varghese§, David Walker\*\*

†Barefoot Networks, \*Intel, ‡Stanford University, \*\*Princeton University, ¶Google, §Microsoft Research

#### ABSTRACT

P4 is a high-level language for programming protocol-independent packet processors. P4 works in conjunction with SDN control protocols like OpenFlow. In its current form, OpenFlow explicitly specifies protocol headers on which it operates. This set has grown from 12 to 41 fields in a few years, increasing the complexity of the specification while still not providing the flexibility to add new headers. In this paper we propose P4 as a strawman proposal for how OpenFlow should evolve in the future. We have three goals: (1) Reconfigurability in the field: Programmers should be able to change the way switches process packets once they are deployed. (2) Protocol independence: Switches should not be tied to any specific network protocols. (3) Target independence: Programmers should be able to describe packet-processing functionality independently of the specifics of the underlying hardware. As an example, we describe how to use P4 to configure a switch to add a new hierarchical label.

#### 1. INTRODUCTION

Software-Defined Networking (SDN) gives operators programmatic control over their networks. In SDN, the control plane is physically separate from the forwarding plane, and one control plane controls multiple forwarding devices. While forwarding devices could be programmed in many ways, having a common, open, vendor-agnostic interface (like OpenFlow) enables a control plane to control forwarding devices from different hardware and software vendors.

| Version | Date     | Header Fields                          |
|---------|----------|----------------------------------------|
| OF 1.0  | Dec 2009 | 12 fields (Ethernet, TCP/IPv4)         |
| OF 1.1  | Feb 2011 | 15 fields (MPLS, inter-table metadata) |
| OF 1.2  | Dec 2011 | 36 fields (ARP, ICMP, IPv6, etc.)      |
| OF 1.3  | Jun 2012 | 40 fields                              |
| OF 1.4  | Oct 2013 | 41 fields                              |

Table 1: Fields recognized by the OpenFlow standard

The OpenFlow interface started simple, with the abstraction of a single table of rules that could match packets on a dozen header fields (e.g., MAC addresses, IP addresses, protocol, TCP/UDP port numbers, etc.). Over the past five years, the specification has grown increasingly more complicated (see Table 1), with many more header fields and multiple stages of rule tables, to allow switches to expose more of their capabilities to the controller.

The proliferation of new header fields shows no signs of stopping. For example, data-center network operators increasingly want to apply new forms of packet encapsulation (e.g., NVGRE, VXLAN, and STT), for which they resort to deploying software switches that are easier to extend with new functionality. Rather than repeatedly extending the OpenFlow specification, we argue that future switches should support flexible mechanisms for parsing packets and matching header fields, allowing controller applications to leverage these capabilities through a common, open interface (i.e., a new "OpenFlow 2.0" API). Such a general, extensible approach would be simpler, more elegant, and more future-proof than today's OpenFlow 1.x standard.



Figure 1: P4 is a language to configure switches

Recent chip designs demonstrate that such flexibility can be achieved in custom ASICs at terabit speeds [1, 2, 8]. Programming this new generation of switch chips is far from easy. Each chip has its own low-level interface, akin to microcode programming. In this paper, we sketch the design of a higher-level language for Programming Protocol-independent Packet Processors (P4). Figure 1 shows the relationship between P4 (used to configure a switch, telling it how packets are to be processed) and existing APIs (such as OpenFlow) that are designed to populate the forwarding tables in fixed function switches. P4 raises the level of abstraction for programming the network, and can serve as a

ACM SIGCOMM Computer Communication Review, Volume 44, Number 3, July 2014







#### Reconfigurable Match Tables (RMTs)

- Reconfigurable Match Table (RMT)
  - Set of pipelined stages, each with a match table of arbitrary depth and width
    - ex: IP match: 32-bit width, 256k depth
    - ex: Ethernet: 48-bit width, 64k depth
- RMT introduces reconfigurability:
  - field definitions can be altered and new fields added
  - the number, topology, widths, and depths of match tables *can be specified*, subject only to an overall resource limit on the number of matched bits
  - new actions may be defined, such as writing new header fields
  - arbitrarily modified packets can be placed in specified queue(s), for output at any subset of ports, with a queuing discipline specified for each queue
- Configuration managed by an SDN controller

| Mac src | Mac<br>dst        | Eth type | Action       |
|---------|-------------------|----------|--------------|
| *       | 68:5b:35:83:c8:56 | *        | Goto Table 2 |
| ...     | ...               | ...      | ...          |
|         | aa:bb:00:cc:00:aa | *        | Goto Table   |

Table 1

| IP<br>src    | IP<br>dst    | Action             |
|--------------|--------------|--------------------|
| *            | *            | Forward to Port P2 |
| *            | 131.114.52.* | Forward to Port P3 |
| 131.114.52.* | *            | Drop               |

Table 2







#### PISA logical architecture

Protocol Independent Switch Architecture (PISA)

- Programmable parser
  - yields a packet header vector
    - header fields + metadata
- Programmable match-action pipeline
  - The packet header vector flows through a sequence of logical match stages that run in series or in parallel
- Programmable deparser
  - Recomposes the packet by serializing headers in the desired order









## P4 Language design principles

- Reconfigurability
  - the controller should be able to reconfigure the switch while it is live
- Protocol independence
  - the controller defines arbitrary packet formats through
    - custom parser
    - custom MATs
- Target independence
  - low-level details not exposed to the programmer
  - the P4 compiler takes care of translating a target-independent program into a target-dependent one
  - which targets? Domain Specific Processors
    - some switch ASIC → PISA (e.g., 12.8 Tbps Intel Tofino 2, etc.)
    - some NPUs
    - NetFPGAs
    - CPUs (sw switches)
- P4 is used to program the switch; OF or any other southbound interface can be used to populate the tables
- P4 language consortium: https://p4.org/
- Specifications: P4<sub>14</sub> (2018) and P4<sub>16</sub> (2019)



Source: CCR P4 paper (2014)







#### PISA: Protocol-Independent Switch Architecture











# **Packet Processing in PISA**



Source: p4.org







#### P4 targets vs. architectures

- P4<sub>14</sub> targeted PISA-like switches
- P4<sub>16</sub> extends the support to multiple programmable devices (targets) through the notion of architecture
- P4 Architecture
  - is a programming model (a programming abstraction)
  - the architecture is what the programmer sees, a logical view of the processing
  - hides the underlying hardware details from the programmer
  - provides an interface to program a target via some set of P4-programmable components, externs, and fixed components
- Device vendors should provide a compiler and an architecture for their target







#### P4 architectures

The architecture is what the programmer sees: a logical view of the processing that hides the underlying hardware details from the programmer

#### my\_program.p4

- Written against a specific architecture
- Defines the processing of each block

#### architecture.p4

- Provided by switch vendor
- Defines which blocks are available, the interfaces of each block, and their capabilities







# **Architectures and targets**



















#### **Portable Switch Architecture (PSA)**


























#### The v1Model architecture for the BMv2 soft switch

- Implemented in BMv2's simple\_switch target (not the only possible one!)
- Very similar to the original PISA architecture







# P4: programming a target











#### **P4 Programming**

- BMv2 Simple Switch
- v1model architecture
  - standard and intrinsic metadata
  - extern "specialized" functions
- P4 code will contain sections for:
  - Parsing
  - Control flow
    - ingress pipeline
    - egress pipeline
  - Deparsing



Source: p4.org

But first we need to set up the development environment...







# A real running testbed?

- Not very practical...
  - Need a real P4 switch and too many PCs...









# A real running testbed?

- Again, not very practical...
  - We don't have that many laptops...









## P4 (virtual) running environment







#### What we need

- A Linux machine with:
  - The BMv2 software switch
  - **p4c**: the reference P4 compiler
  - Mininet: a lightweight network emulation environment
- As a lot of dependencies need to be installed, I suggest downloading a VM with everything in it:
  - Tutorial from P4 official page:
    - https://github.com/p4lang/tutorials
    - lots of exercises in there
  - github repository of examples presented today
    - https://github.com/grp-xx/5G-SummerSchool-2022







#### p4 program structure







#### v1Model standard\_metadata

```
struct standard_metadata_t {
    bit<9>  ingress_port;     // The arrival port of the packet
    bit<9>  egress_spec;      // The port the packet should be sent to
    bit<9>  egress_port;      // The departure port (can be read in the egress pipeline only)
    bit<32> clone_spec;
    bit<32> instance_type;
    bit<1>  drop;
    bit<16> recirculate_port;
    bit<32> packet_length;
    bit<32> enq_timestamp;
    bit<19> enq_qdepth;
    bit<32> deq_timedelta;
    bit<19> deq_qdepth;
    bit<48> ingress_global_timestamp;
    bit<32> lf_field_list;
    bit<16> mcast_grp;
    bit<1>  resubmit_flag;
    bit<16> egress_rid;
    bit<1>  checksum_error;
}
```





#### P4 Types

- Statically typed language with
  - base types
  - composed types

```
// Base types: bool, bit<W>, int<W>, varbit<W>, enum
// No float, no strings!

typedef bit<48> macAddr_t;

header Ethernet_h {
    bit<48> dstAddr;
    bit<48> srcAddr;
    bit<16> etherType;
}
Ethernet_h etherHeader;

struct standard_metadata_t {
    bit<9> ingress_port;
    bit<9> egress_spec;
    bit<9> egress_port;
    ...
}

tuple<bit<32>, bool> x;
x = { 10, false };
```

```
header Mpls_h {
    bit<20> label;
    bit<3>  tc;
    bit     bos;
    bit<8>  ttl;
}
Mpls_h[10] mpls;      // header stack

header_union IP_h {
    IPv4_h v4;
    IPv6_h v6;
}
IP_h ipHeader;
```







## P4 operations

- Arithmetic
  - +, -, \* /\* no division!!! \*/
- Logical

- Bit-slicing [m:l]
- Bit concatenation ++
- No modulo operation!!!
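To get a feeling for bit-slicing and concatenation, the sketch below mimics them on plain Python integers. The helpers `bit_slice` and `bit_concat` are hypothetical names chosen for this illustration, not P4 constructs. It also shows a common workaround for the missing modulo: for a power of two, taking the low bits with a slice gives the same result.

```python
def bit_slice(value, m, l):
    """Return bits m..l (inclusive, m >= l) of value, like value[m:l] in P4."""
    width = m - l + 1
    return (value >> l) & ((1 << width) - 1)


def bit_concat(hi, lo, lo_width):
    """Concatenate hi and lo, like hi ++ lo in P4, where lo is lo_width bits wide."""
    return (hi << lo_width) | (lo & ((1 << lo_width) - 1))


# First byte of an IPv4 header: version in the high nibble, IHL in the low one.
version_ihl = 0x45
version = bit_slice(version_ihl, 7, 4)
ihl = bit_slice(version_ihl, 3, 0)
rebuilt = bit_concat(version, ihl, 4)

# Modulo by a power of two expressed as a slice: x mod 8 == x[2:0].
x = 1234
low3 = bit_slice(x, 2, 0)
print(version, ihl, hex(rebuilt), low3)
```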







#### P4 Variables and constants

- Variables
  - Have local scope
  - not maintained upon the next invocation
  - cannot be used to save state!
    - use tables and externs for that

```
bit<8> x = 10;
typedef bit<16> TcpPort;
TcpPort s_port;
s_port = 10000;
```

#### Constants

```
const bit<8> x = 10;
typedef bit<16> TcpPort;
const TcpPort s_port = 10000;
```







#### P4 main statements

- return
  - terminates the action or control in which it is contained
- exit
  - terminates all blocks in execution
- conditions
  - cannot be used in parsers

```
if (x==100) {...} else {...}
```

- switch
  - can be used in control blocks only

Forget about loops in P4!!!

```
switch (hdr.ethernet.etherType) {
    0x86dd: { /* body omitted */ }
    0x0800: { /* body omitted */ }
    0x0802: { /* body omitted */ }
    0xcafe: { /* body omitted */ }
    default: { /* body omitted */ }
}
```







#### P4 parser

- Finite State Machine
  - states
    - predefined: start, accept, reject
    - plus user-defined ones
  - transitions
  - loops are allowed (e.g. tunneling)
  - more advanced methods: verify, lookahead,...

[Figure: parser state machine with the states start, parse_ethernet, parse_ipv4, accept and reject]

```
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t std_meta) {
    state start {
        transition parse_ethernet;
    }
    state parse_ethernet {
        packet.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            0x800: parse_ipv4;
            default: accept;
        }
    }
    state parse_ipv4 {
        /* ... */
    }
}
```
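The finite-state-machine nature of the parser can be made concrete with a small Python model. It is a rough sketch, not how BMv2 implements parsing: the function name `parse` is made up, only the TTL and protocol fields of IPv4 are kept, and the IPv4 header is assumed to be exactly 20 bytes long (no options).

```python
import struct


def parse(packet: bytes):
    """Walk the states start -> parse_ethernet -> (parse_ipv4) -> accept/reject."""
    hdr = {}
    offset = 0
    state = "start"
    while state not in ("accept", "reject"):
        if state == "start":
            state = "parse_ethernet"
        elif state == "parse_ethernet":
            if len(packet) < offset + 14:
                state = "reject"  # not enough bytes to extract the header
                continue
            dst, src, ether_type = struct.unpack_from("!6s6sH", packet, offset)
            hdr["ethernet"] = {"dstAddr": dst, "srcAddr": src, "etherType": ether_type}
            offset += 14
            # transition select(hdr.ethernet.etherType)
            state = "parse_ipv4" if ether_type == 0x0800 else "accept"
        elif state == "parse_ipv4":
            if len(packet) < offset + 20:
                state = "reject"
                continue
            hdr["ipv4"] = {"ttl": packet[offset + 8], "protocol": packet[offset + 9]}
            offset += 20
            state = "accept"
    return state, hdr, packet[offset:]  # the unparsed bytes are the payload


macs = b"\x01" * 6 + b"\x02" * 6
ipv4 = bytes([0x45, 0, 0, 22, 0, 0, 0, 0, 64, 6, 0, 0, 10, 0, 0, 1, 10, 0, 0, 2])
state, hdr, payload = parse(macs + b"\x08\x00" + ipv4 + b"hi")
print(state, hdr["ipv4"], payload)
```

Each `elif` branch plays the role of one parser state, and the assignment to `state` corresponds to the `transition` statement.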





#### P4 control

- Control blocks implement:
  - match-action pipelines
  - deparser
  - additional packet processing (e.g. checksum update)
- Control blocks contain:
  - Tables
    - match a key and return an action (and data)
  - Actions
    - pretty much like C functions, for software re-use
    - no return value, though
    - parameters have directions
      - in: read-only inside the action
      - out: to be written inside the action
      - inout: read-write
  - Control flow
    - blocks of imperative code (without loops!)
    - may contain advanced concepts such as cloning packets, sending packets to control plane, etc.







#### P4 action and control flow

```
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t std_meta) {
    action swap_mac(inout bit<48> src,
                    inout bit<48> dst) {
        bit<48> tmp = src;
        src = dst;
        dst = tmp;
    }
    apply {
        swap_mac(hdr.ethernet.srcAddr,
                 hdr.ethernet.dstAddr);
        std_meta.egress_spec = std_meta.ingress_port;
    }
}
```













#### P4 tables in the code...

```
table ipv4_lpm {                     // Table name
    key = {
        hdr.ipv4.dstAddr: lpm;       // Match type
    }
    actions = {                      // Possible actions
        ipv4_forward;
        drop;
        NoAction;
    }
    size = 1024;                     // Max # of table entries
    default_action = NoAction();     // Default action
}
```

```
/* core.p4 */
match_kind {
    exact,
    ternary,
    lpm
}

/* v1model.p4 */
match_kind {
    range,
    selector
}

/* Some other architecture */
match_kind {
    /* ... */
}
```
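What an `lpm` key does at lookup time can be sketched in Python with the standard `ipaddress` module. This is a deliberately naive linear scan under illustrative assumptions: the class `LpmTable` and its methods are invented for the example, and the entries are sample values, not a dump of a real switch.

```python
import ipaddress


class LpmTable:
    """Toy longest-prefix-match table: the most specific matching prefix wins."""

    def __init__(self, default_action=("NoAction", ())):
        self.entries = []
        self.default_action = default_action

    def add(self, prefix, action, params=()):
        self.entries.append((ipaddress.ip_network(prefix), action, params))

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        best = None
        for net, action, params in self.entries:
            if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, action, params)
        return (best[1], best[2]) if best else self.default_action


ipv4_lpm = LpmTable()
ipv4_lpm.add("10.0.0.0/8", "drop")
ipv4_lpm.add("10.0.3.0/24", "ipv4_forward", ("cc:cc:cc:cc:cc:cc", 2))
ipv4_lpm.add("10.0.1.11/32", "ipv4_forward", ("ca:fe:ba:be:ca:fe", 1))

print(ipv4_lpm.lookup("10.0.3.22"))
print(ipv4_lpm.lookup("192.168.0.1"))
```

Hardware targets typically use TCAMs or tries rather than a scan, but the result of the lookup is the same.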







Tables populate

#### **Control plane**







#### **P4 Deparser**

Serializes headers in the desired order and rebuilds the packet

```
ethernet
ipv4
tcp

payload
```







#### Checksum - Validation and update (1)

Extern defined in the v1model





#### Checksum - Validation and update (2)

```
control MyComputeChecksum(...) {
    apply {
        update_checksum(
            hdr.ipv4.isValid(),
            { hdr.ipv4.version,
              hdr.ipv4.ihl,
              hdr.ipv4.diffserv,
              hdr.ipv4.totalLen,
              hdr.ipv4.identification,
              hdr.ipv4.flags,
              hdr.ipv4.fragOffset,
              hdr.ipv4.ttl,
              hdr.ipv4.protocol,
              hdr.ipv4.srcAddr,
              hdr.ipv4.dstAddr },
            hdr.ipv4.hdrChecksum,
            HashAlgorithm.csum16);
    }
}
```
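The arithmetic behind `HashAlgorithm.csum16` is the standard 16-bit one's-complement Internet checksum. The Python sketch below computes it over a sample 20-byte IPv4 header whose checksum field is zeroed; the function name `csum16` is borrowed from the P4 enum only for readability, and the header bytes are an arbitrary example.

```python
def csum16(data: bytes) -> int:
    """16-bit one's-complement sum of all 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample IPv4 header (192.168.0.1 -> 192.168.0.199, UDP), checksum field set to 0.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = csum16(header)
print(hex(checksum))

# Verification as a receiver would do it: with the checksum filled in, the result is 0.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(csum16(filled))
```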





### P4 stateful programming

- Stateless objects
  - variables
  - headers
- Stateful objects
  - tables
    - can only be populated by the control plane!
  - externs
    - Registers
    - Counters
    - Meters
      - used to limit traffic rate







### Registers

- Good to store arbitrary data (of type <Type>)
- read and write methods available
- assigned in arrays of length N









#### **Counters**

- Good for... counting ;-)
- Three types only:
  - packets, bytes, packets\_and\_bytes
- Only count method available (read available from the control plane)
- Assigned in arrays of length N



- There are also *direct counters*: a special type of counter attached to tables







#### P4 Hello World: packet reflector

- Not easy to define a hello world for packet processing...
  - can't print a string on screen ;-)
- Packet reflector
  - the switch receives the Ethernet frame and sends it back to the host (after swapping the MAC addresses)







#### Packet reflector implementation (1)

```
/* -*- P4_16 -*- */
#include <core.p4>
#include <v1model.p4>

/*** H E A D E R S ****/
typedef bit<48> macAddr_t;

header ethernet_t {
    macAddr_t dstAddr;
    macAddr_t srcAddr;
    bit<16>   etherType;
}

struct metadata {
    /* empty */
}

struct headers {
    ethernet_t ethernet;
}
```

Ethernet II frame:

| Destination MAC | Source MAC | Type    | Data            | Frame Check Sequence |
|-----------------|------------|---------|-----------------|----------------------|
| 6 Bytes         | 6 Bytes    | 2 Bytes | 46 – 1500 Bytes | 4 Bytes              |







#### Packet reflector implementation (2)

```
/*** INGRESS PROCESSING ****/
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t standard_metadata) {
    action swap_macs(inout macAddr_t src, inout macAddr_t dst) {
        macAddr_t tmp = src;
        src = dst;
        dst = tmp;
    }
    apply {
        // Swap MAC addresses.
        swap_macs(hdr.ethernet.srcAddr, hdr.ethernet.dstAddr);
        // Set output port == input port
        standard_metadata.egress_spec = standard_metadata.ingress_port;
    }
}
```

```
/*** S W I T C H ****/
V1Switch(
        MyParser(),
        MyVerifyChecksum(),
        MyIngress(),
        MyEgress(),
        MyComputeChecksum(),
        MyDeparser()
) main;
```







#### Reflect one out of two? Count reflected frames?

```
/*** INGRESS PROCESSING ****/
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t standard_metadata) {
    action swap_macs(...) { }   // same body as in the plain reflector
    register<bit<1>>(8) reg;    // one 1-bit cell per input port
    apply {
        bit<1> flag;
        bit<32> input_port = (bit<32>) standard_metadata.ingress_port;
        reg.read(flag, input_port);
        reg.write(input_port, flag + 1);   // bit<1> arithmetic wraps: 0 -> 1 -> 0
        if (flag == 1) {
            mark_to_drop(standard_metadata);
        } else {
            swap_macs(hdr.ethernet.srcAddr, hdr.ethernet.dstAddr);
            standard_metadata.egress_spec = standard_metadata.ingress_port;
        }
    }
}
```
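The key point of the code above is that the register keeps its value between packets, and that adding 1 to a `bit<1>` wraps around. A minimal Python model of that behaviour follows; the `Register` class and the `ingress` function are hypothetical stand-ins for the v1model extern and the control block, written only for this illustration.

```python
class Register:
    """Array of `size` cells, each `width` bits wide, that persists across packets."""

    def __init__(self, size, width):
        self.cells = [0] * size
        self.mask = (1 << width) - 1

    def read(self, index):
        return self.cells[index]

    def write(self, index, value):
        self.cells[index] = value & self.mask  # wrap around like bit<W> arithmetic


reg = Register(size=8, width=1)


def ingress(port):
    """Process one frame arriving on `port`: reflect every other frame, drop the rest."""
    flag = reg.read(port)
    reg.write(port, flag + 1)
    return "drop" if flag == 1 else "reflect"


results = [ingress(3) for _ in range(4)]
print(results)
```

Because the register is indexed by input port, each port alternates independently of the others.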

 How do I access registers and counters from the control plane?

```
simple_switch_CLI --thrift-port 9090
> register_read...
> counter_read...
```





### And finally using tables...

```
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t standard_metadata) {
    register<bit<1>>(8) reg;
    bit<1> flag;

    action swap_macs(inout macAddr_t src, inout macAddr_t dst) {
        macAddr_t tmp = src;
        src = dst;
        dst = tmp;
    }
    action frame_reflect() {
        swap_macs(hdr.ethernet.srcAddr, hdr.ethernet.dstAddr);
        standard_metadata.egress_spec = standard_metadata.ingress_port;
    }
    action drop() {
        mark_to_drop(standard_metadata);
    }

    table odd_even {
        key = {
            flag: exact;
        }
        actions = {
            frame_reflect;
            drop;
        }
        size = 8;
        default_action = drop();
    }

    apply {
        bit<32> input_port = (bit<32>) standard_metadata.ingress_port;
        reg.read(flag, input_port);
        reg.write(input_port, flag + 1);
        odd_even.apply();
    }
}
```

To populate the table, run simple\_switch\_CLI --thrift-port 9090 and enter: table\_add odd\_even frame\_reflect 0 =>







#### Add your own header...

A simple (and useless) example: add the input port number after the ethernet header

In the declaration section

```
header extra_t {
    bit<16> in_port;
}

struct headers {
    ethernet_t ethernet;
    extra_t    extra;
}

// In the MyIngress control block
hdr.extra.setValid();
hdr.extra.in_port = (bit<16>) std_meta.ingress_port;

// In the deparser
control MyDeparser(packet_out packet, in headers hdr) {
    apply {
        packet.emit(hdr.ethernet);
        packet.emit(hdr.extra);
    }
}
```







### L2 Switching

- Switching table provided through the control plane
- Table populated through the simple\_switch\_CLI (Thrift client)









#### **L2 Switching - Solution**

```
table out_iface {
    key = {
        hdr.ethernet.dstAddr: exact;
    }
    actions = {
        drop;
        eth_forward;
    }
    size = 8;
    default_action = drop();
}

apply {
    out_iface.apply();
}
```

```
# To populate the table:
simple_switch_CLI --thrift-port 9090
> table_add out_iface eth_forward 00:00:0a:00:00:01 => 1
> ...
> table_add out_iface eth_forward 00:00:0a:00:00:04 => 4
```







#### L2 Switching with Access Control List (ACL)

- Switching table provided through the control plane
- Table populated through the simple\_switch\_CLI (Thrift client)
- Allow forwarding only from MAC addresses included in the ACL









#### **L2 Switching with ACL - Solution**

```
control MyEgress(inout headers hdr,
                 inout metadata meta,
                 inout standard_metadata_t standard_metadata) {
    action block() {
        mark_to_drop(standard_metadata);
    }
    table acl {
        key = {
            hdr.ethernet.srcAddr: exact;
        }
        actions = {
            block;
            NoAction;
        }
        default_action = block();
        size = 8;
    }
    apply {
        acl.apply();
    }
}
```

```
// In MyIngress
table out_iface {
    key = {
        hdr.ethernet.dstAddr: exact;
    }
    actions = {
        drop;
        eth_forward;
    }
    size = 8;
    default_action = drop();
}

apply {
    out_iface.apply();
}
```

```
# To populate the table:
simple_switch_CLI --thrift-port 9090
> table_add acl NoAction 00:00:0a:00:00:02 =>
> table_add acl NoAction 00:00:0a:00:00:03 =>
```
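The combined effect of the two tables (forwarding on the destination MAC in ingress, ACL on the source MAC in egress) can be checked with a small Python model. It is a sketch under assumptions: the function `process` is invented for the example, the `out_iface` entries are only the two shown on the L2 switching slide, and the ACL holds the two addresses added above.

```python
DROP = None  # stand-in for mark_to_drop()

# Exact-match tables as dictionaries/sets keyed by MAC address.
out_iface = {
    "00:00:0a:00:00:01": 1,
    "00:00:0a:00:00:04": 4,
}
acl = {"00:00:0a:00:00:02", "00:00:0a:00:00:03"}  # sources allowed to send


def process(src_mac, dst_mac):
    """Return the output port of a frame, or DROP if any stage drops it."""
    # Ingress: out_iface, default action drop()
    port = out_iface.get(dst_mac)
    if port is None:
        return DROP
    # Egress: acl, default action block()
    if src_mac not in acl:
        return DROP
    return port


print(process("00:00:0a:00:00:02", "00:00:0a:00:00:01"))
print(process("00:00:0a:00:00:01", "00:00:0a:00:00:04"))
```

The second frame is dropped even though its destination is known, because its source is not in the ACL.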







### L3 Forwarding

- Forwarding table provided via the control plane
- Table populated through the simple\_switch\_CLI (Thrift client)







### **L3-forwarding - Headers**

```
const bit<16> TYPE_IPV4 = 0x800;

typedef bit<9>  egressSpec_t;
typedef bit<48> macAddr_t;
typedef bit<32> ip4Addr_t;

header ethernet_t {
    macAddr_t dstAddr;
    macAddr_t srcAddr;
    bit<16>   etherType;
}

header ipv4_t {
    bit<4>    version;
    bit<4>    ihl;
    bit<8>    diffserv;
    bit<16>   totalLen;
    bit<16>   identification;
    bit<3>    flags;
    bit<13>   fragOffset;
    bit<8>    ttl;
    bit<8>    protocol;
    bit<16>   hdrChecksum;
    ip4Addr_t srcAddr;
    ip4Addr_t dstAddr;
}
```

```
struct metadata {
    /* empty */
}

struct headers {
    ethernet_t ethernet;
    ipv4_t ipv4;
}
```







#### **L3-forwarding - Parser**

```
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t standard_metadata) {
    state start {
        packet.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            TYPE_IPV4: ipv4;   // TYPE_IPV4 = 0x0800
            default: accept;
        }
    }
    state ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }
}
```







#### **L3-forwarding - Control flow**

```
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t standard_metadata) {
    action drop() {
        mark_to_drop(standard_metadata);
    }
    action ipv4_forward(macAddr_t d_mac, egressSpec_t port) {
        hdr.ethernet.dstAddr = d_mac;
        standard_metadata.egress_spec = port;
        hdr.ipv4.ttl = hdr.ipv4.ttl - 1;
    }

    table ipv4_lpm {
        key = {
            hdr.ipv4.dstAddr: lpm;
        }
        actions = {
            ipv4_forward;
            drop;
            NoAction;
        }
        size = 1024;
        default_action = NoAction();
    }

    apply {
        if (hdr.ipv4.isValid()) {
            ipv4_lpm.apply();
        }
    }
}
```







## L3-forwarding - Egress processing

```
control MyEgress(inout headers hdr,
                 inout metadata meta,
                 inout standard_metadata_t standard_metadata) {
    action drop() {
        mark_to_drop(standard_metadata);
    }
    action set_smac(macAddr_t mac) {
        hdr.ethernet.srcAddr = mac;
    }

    table s_mac {
        key = {
            standard_metadata.egress_port: exact;
        }
        actions = {
            set_smac;
            drop;
            NoAction;
        }
        size = 16;
        default_action = NoAction();
    }

    apply {
        s_mac.apply();
    }
}
```







### L3 forwarding - Checksum and deparser

```
control MyComputeChecksum(inout headers hdr, inout metadata meta) {
    apply {
        update_checksum(
            hdr.ipv4.isValid(),
            { hdr.ipv4.version,
              hdr.ipv4.ihl,
              hdr.ipv4.diffserv,
              hdr.ipv4.totalLen,
              hdr.ipv4.identification,
              hdr.ipv4.flags,
              hdr.ipv4.fragOffset,
              hdr.ipv4.ttl,
              hdr.ipv4.protocol,
              hdr.ipv4.srcAddr,
              hdr.ipv4.dstAddr },
            hdr.ipv4.hdrChecksum,
            HashAlgorithm.csum16);
    }
}
```

```
control MyDeparser(packet_out packet, in headers hdr) {
    apply {
        packet.emit(hdr.ethernet);
        packet.emit(hdr.ipv4);
    }
}
```

```
//switch architecture
V1Switch(
MyParser(),
MyVerifyChecksum(),
MyIngress(),
MyEgress(),
MyComputeChecksum(),
MyDeparser()
) main;
```





## L3 Forwarding: populating the tables

```
> simple_switch_CLI --thrift-port 9090 < s1-commands.txt
> simple_switch_CLI --thrift-port 9091 < s2-commands.txt

# s1-commands.txt
table_clear ipv4_lpm
table_add ipv4_lpm ipv4_forward 10.0.1.11/32 => ca:fe:ba:be:ca:fe 1
table_add ipv4_lpm ipv4_forward 10.0.3.0/24 => cc:cc:cc:cc:cc:cc 2
table_clear s_mac
table_add s_mac set_smac 1 => aa:aa:aa:aa:aa:aa
table_add s_mac set_smac 2 => bb:bb:bb:bb:bb:bb

# s2-commands.txt
table_clear ipv4_lpm
table_add ipv4_lpm ipv4_forward 10.0.1.0/24 => bb:bb:bb:bb:bb:bb 2
table_add ipv4_lpm ipv4_forward 10.0.3.22/32 => de:ad:be:ef:fe:ed 1
table_clear s_mac
table_add s_mac set_smac 1 => dd:dd:dd:dd:dd:dd
table_add s_mac set_smac 2 => cc:cc:cc:cc:cc:cc
```





#### IP traffic probe and filter

- The switch normally forwards frames among h1, h2 and h3
- In addition, it "mirrors" (clones) every IP packet to the probe
  - Traffic to the probe is further filtered by *protocol* (in the IP header)









#### **IP probe - Headers**

```
const bit<16> TYPE_IPV4 = 0x800;

typedef bit<48> macAddr_t;
typedef bit<32> ip4Addr_t;

header ethernet_t {
    macAddr_t dstAddr;
    macAddr_t srcAddr;
    bit<16>   etherType;
}

header ipv4_t {
    bit<4>    version;
    bit<4>    ihl;
    bit<8>    diffserv;
    bit<16>   totalLen;
    bit<16>   identification;
    bit<3>    flags;
    bit<13>   fragOffset;
    bit<8>    ttl;
    bit<8>    protocol;
    bit<16>   hdrChecksum;
    ip4Addr_t srcAddr;
    ip4Addr_t dstAddr;
}
```

```
struct metadata {
    /* empty */
}

struct headers {
    ethernet_t ethernet;
    ipv4_t ipv4;
}
```







#### IP probe - Parser

```
parser MyParser(packet_in packet,
                out headers hdr,
                inout metadata meta,
                inout standard_metadata_t standard_metadata) {
    state start {
        packet.extract(hdr.ethernet);
        transition select(hdr.ethernet.etherType) {
            TYPE_IPV4: ipv4;   // TYPE_IPV4 = 0x0800
            default: accept;
        }
    }
    state ipv4 {
        packet.extract(hdr.ipv4);
        transition accept;
    }
}
```





#### IP probe - Control flow

```
control MyIngress(inout headers hdr,
                  inout metadata meta,
                  inout standard_metadata_t standard_metadata) {
    action drop() {
        mark_to_drop(standard_metadata);
    }
    action eth_forward(bit<9> out_port) {
        standard_metadata.egress_spec = out_port;
    }
    table out_iface {
        key = {
            hdr.ethernet.dstAddr: exact;
        }
        actions = {
            drop;
            eth_forward;
        }
        size = 8;
        default_action = drop();
    }
    apply {
        out_iface.apply();
    }
}
```

```
// Added to MyIngress for the probe
action pkt_clone(bit<32> mirror_id) {
    clone(CloneType.I2E, mirror_id);
}
table monitor {
    key = {
        hdr.ipv4.protocol: exact;
    }
    actions = {
        NoAction;
        pkt_clone;
    }
    size = 8;
    default_action = NoAction();
}
apply {
    if (hdr.ipv4.isValid()) {
        monitor.apply();
    }
    out_iface.apply();
}
```








#### IP probe: populating the tables

> simple\_switch\_CLI --thrift-port 9090 < s1-commands.txt

```
# s1-commands.txt

table_add out_iface eth_forward 00:00:0a:00:00:01 => 1
table_add out_iface eth_forward 00:00:0a:00:00:02 => 2
table_add out_iface eth_forward 00:00:0a:00:00:03 => 3

mirroring_add 100 4
table_add monitor pkt_clone 1 => 100
table_add monitor pkt_clone 6 => 100
```
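To see what these entries achieve together, the following Python sketch models the ingress decisions: the `monitor` table picks a mirroring session by IP protocol, the session maps to the probe port, and `out_iface` forwards the original frame. It is a simplification with assumed names (`process`, `mirror_sessions`) and it ignores the actual packet contents; in particular it does not model how BMv2 schedules the cloned copy.

```python
# State installed by the commands above.
mirror_sessions = {100: 4}            # mirroring_add 100 4
monitor = {1: 100, 6: 100}            # IP protocol (ICMP, TCP) -> mirror session
out_iface = {
    "00:00:0a:00:00:01": 1,
    "00:00:0a:00:00:02": 2,
    "00:00:0a:00:00:03": 3,
}


def process(dst_mac, ip_protocol=None):
    """Return the list of ports a frame leaves from (probe copy first, if any).

    ip_protocol is None for non-IP frames, i.e. when hdr.ipv4 is not valid.
    """
    ports = []
    if ip_protocol is not None:
        session = monitor.get(ip_protocol)   # default action: NoAction
        if session is not None:
            ports.append(mirror_sessions[session])
    port = out_iface.get(dst_mac)            # default action: drop
    if port is not None:
        ports.append(port)
    return ports


print(process("00:00:0a:00:00:02", ip_protocol=6))    # TCP: mirrored and forwarded
print(process("00:00:0a:00:00:02", ip_protocol=17))   # UDP: forwarded only
```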







# **Part II**

Giuseppe Lettieri

End host data plane programmability







#### eBPF and XDP

We are going to see a few practical examples.

If you want to replicate the examples on your own:

- If you already have Linux, download the 5gss-xpd.tar.gz tarball
- Or, download the VM (VirtualBox): https://calcolatori.iet.unipi.it/xdp.ova

The tarball and VM also contain a handout (notes.pdf) for this part of the lecture.

